EDA-Coursera-wk2

Lattice Plotting System (part 1)

Lattice1

Core functions: xyplot, bwplot, histogram, stripplot, dotplot, splom, levelplot, contourplot (lattice is built on top of grid graphics)

Lattice requires one to make the plot “all at once.”

Simple xyplot

  • xyplot(y ~ x | f * g, data)
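A minimal sketch of the two-factor form of this formula, assuming the built-in mtcars data as a stand-in (it is not part of the course examples). The factors cyl and am play the roles of f and g, so lattice draws one panel per combination of their levels (3 x 2 = 6 panels).

```r
library(lattice)
library(datasets)

# Assumption: mtcars is used purely for illustration.
# y ~ x | f * g: plot mpg against wt, conditioned on cylinders and transmission
p <- xyplot(mpg ~ wt | factor(cyl) * factor(am), data = mtcars)
print(p)

# The trellis object records the levels of each conditioning variable
n_panels <- prod(sapply(p$condlevels, length))
n_panels
```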

Example #1

library(lattice)
library(datasets)

xyplot(Ozone ~ Wind, data = airquality)

5 Panel factor xyplot

airquality <- transform(airquality, 
                        Month = factor(Month))
xyplot(Ozone ~ Wind | Month, 
       data = airquality, 
       layout = c(5,1))

# Plots May to Sept by factor(month)

Lattice Behavior - Lattice functions return an object of class trellis; the object auto-prints at the console unless it is assigned to a variable

Saving plots to workspace

p <- xyplot(Ozone ~ Wind, data = airquality) # Nothing prints
print(p) # Now it prints
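Because the plot is an ordinary object, it can also be sent to a file device by printing it explicitly. The sketch below assumes a temporary PDF file (out_file is a hypothetical name); the explicit print() is also what is needed inside functions, loops, or scripts, where auto-printing does not happen.

```r
library(lattice)
library(datasets)

p <- xyplot(Ozone ~ Wind, data = airquality)
class(p)  # the plot is stored as a trellis object

# Assumption: a throwaway file path, used only for illustration
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)   # open a file graphics device
print(p)        # draw the stored plot on that device
dev.off()       # close the device so the file is written
```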

Lattice Plotting System (part 2)

set.seed(100)
x <- rnorm(100)
f <- rep(0:1, each = 50)
y <- x + f - f * x + rnorm(100, sd = 0.5)
f <- factor(f, labels = c("Grupo1", "Grupo2"))
xyplot(y ~ x | f, layout = c(2,1)) # plot 2 cols, 1 row

Plots w/ median lines

xyplot(y ~ x | f, panel = function(x,y, ...) {
    panel.xyplot(x,y, ...) # 1st call the default function for xyplot
    panel.abline(h = median(y), lty = 2) # Adds horizontal line at median
})

Plots w/ regression lines

xyplot(y ~ x | f, panel = function(x,y, ...) {
    panel.xyplot(x,y, ...) # 1st call the default function for xyplot
    panel.lmline(x, y, col = 3) # Adds regression lines
})
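The two panel functions above can be combined into one, and the lines they draw can be checked numerically. This is a sketch under stated assumptions: panel_median_lm and slopes are hypothetical names, and the data are regenerated with the same seed as above. Each panel receives only its own subset of x and y, so the median and the regression line are computed per group.

```r
library(lattice)

set.seed(100)
x <- rnorm(100)
f <- rep(0:1, each = 50)
y <- x + f - f * x + rnorm(100, sd = 0.5)
f <- factor(f, labels = c("Grupo1", "Grupo2"))

# Hypothetical helper: points, then a median line, then a regression line
panel_median_lm <- function(x, y, ...) {
  panel.xyplot(x, y, ...)               # default scatterplot for this panel
  panel.abline(h = median(y), lty = 2)  # horizontal line at the panel median
  panel.lmline(x, y, col = 2)           # least-squares line for the panel
}

p <- xyplot(y ~ x | f, layout = c(2, 1), panel = panel_median_lm)
print(p)

# Per-group slopes, fitted separately with lm() as a cross-check
slopes <- sapply(split(data.frame(x, y), f),
                 function(d) coef(lm(y ~ x, data = d))[["x"]])
slopes
```

By construction the first group follows y = x plus noise and the second follows y = 1 plus noise, so the first slope should be close to 1 and the second close to 0.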

Example: see the Mouse Allergen & Asthma Cohort Study, Ahluwalia et al., Journal of Allergy and Clinical Immunology, 2013

Summary:

  • Lattice plots require a single function call
  • Margins and spacing are set automatically, and the default values usually work
  • Lattice plots are well suited to conditioning plots

ggplot2 part 1


gg = “grammar of graphics”, after the book by Leland Wilkinson; ggplot2 was written by Hadley Wickham while he was a graduate student.

See: http://ggplot2.org for documentation

  • the grammar of graphics tells us that we have mapping from data to aesthetic attributes (color, shape, size) of geometry (points, lines, bars) drawn on a coordinate system.
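A small sketch of that mapping idea using the mpg data that ships with ggplot2. The names p and built are hypothetical; layer_data() is used only to inspect what the plot computed. The variables displ, hwy and drv are mapped to the x position, y position and color aesthetics of a point geometry.

```r
library(ggplot2)

# Map data columns to aesthetics (x, y, color), then choose a geometry
p <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point()
print(p)

# Inspect the computed layer: one row per point, with a colour per drv level
built <- layer_data(p, 1)
head(built[, c("x", "y", "colour")])
```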

Basic: qplot()

library(ggplot2)
head(mpg)
## # A tibble: 6 x 11
##   manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
##   <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr> 
## 1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa…
## 2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa…
## 3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa…
## 4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa…
## 5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa…
## 6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa…
str(mpg)
## tibble [234 × 11] (S3: tbl_df/tbl/data.frame)
##  $ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ...
##  $ model       : chr [1:234] "a4" "a4" "a4" "a4" ...
##  $ displ       : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
##  $ year        : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
##  $ cyl         : int [1:234] 4 4 4 4 6 6 6 4 4 4 ...
##  $ trans       : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
##  $ drv         : chr [1:234] "f" "f" "f" "f" ...
##  $ cty         : int [1:234] 18 21 20 21 16 18 18 18 16 20 ...
##  $ hwy         : int [1:234] 29 29 31 30 26 26 27 26 25 28 ...
##  $ fl          : chr [1:234] "p" "p" "p" "p" ...
##  $ class       : chr [1:234] "compact" "compact" "compact" "compact" ...

Simple ggplot2 example

qplot(x = displ, y = hwy, data = mpg)

Modifying Aesthetics

qplot(x = displ, y = hwy, data = mpg, color = drv)

Adding a geom (Geometry)

qplot(x = displ, y = hwy, data = mpg, geom = c("point", "smooth"))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Histograms

qplot(hwy, data = mpg, fill = drv, bins = 40)

Facets

qplot(displ, hwy, data = mpg, facets = .~drv)

qplot(hwy, data = mpg, facets = drv ~., binwidth = 2)
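qplot() was deprecated in ggplot2 3.4.0, so it may help to see the two faceted plots above written with ggplot(). This is a sketch of equivalent calls, not the course's own code; p_scatter and p_hist are hypothetical names.

```r
library(ggplot2)

# Scatterplot with one column of panels per drive type (facets = . ~ drv)
p_scatter <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_grid(. ~ drv)
print(p_scatter)

# Histogram with one row of panels per drive type (facets = drv ~ .)
p_hist <- ggplot(mpg, aes(hwy)) +
  geom_histogram(binwidth = 2) +
  facet_grid(drv ~ .)
print(p_hist)
```

In the facet formula the left-hand side gives the rows and the right-hand side gives the columns; a dot means no split in that direction.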

Example from R. Peng: Mouse Allergen & Asthma Cohort Study (MAACS)

  • Baltimore children (aged 5-17)
  • Studies the indoor environment and its relationship with asthma morbidity
  • See: http://goo.gl/Wqe9j8
  • Partial data currently available publicly

“simulated” maacs dataset

setwd("~/Dropbox/R_exercises/Coursera/Exploratory Data Analysis/week2")

load("maacs.Rda")
str(maacs)
## 'data.frame':    750 obs. of  9 variables:
##  $ id            : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ eno           : num  141 124 126 164 99 68 41 50 12 30 ...
##  $ duBedMusM     : num  2423 2793 3055 775 1634 ...
##  $ pm25          : num  15.6 34.4 39 33.2 27.1 ...
##  $ mopos         : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
##  $ logpm25       : num  1.19 1.54 1.59 1.52 1.43 ...
##  $ NocturnalSympt: int  0 0 2 2 2 2 0 1 0 0 ...
##  $ bmicat        : Factor w/ 2 levels "normal weight",..: 1 2 2 1 1 1 2 2 2 1 ...
##  $ logno2_new    : num  1.62 1.88 1.71 1.46 1.29 ...
qplot(log(eno), data = maacs)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 108 rows containing non-finite values (stat_bin).

qplot(log(eno), data = maacs, fill = mopos)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 108 rows containing non-finite values (stat_bin).
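The "non-finite values" warning arises because log() of a missing value is NA (and log() of zero or a negative number gives -Inf or NaN), and the binning step drops such rows. The sketch below uses a small made-up data frame (toy, with a hypothetical eno column standing in for maacs$eno) to count those rows before plotting.

```r
library(ggplot2)

# Assumption: invented values, only to illustrate the warning
toy <- data.frame(eno = c(12, 30, NA, 50, NA, 141))

# Count the rows the histogram would have to drop
n_dropped <- sum(!is.finite(log(toy$eno)))
n_dropped

# na.rm = TRUE drops the same rows but without the warning
p <- ggplot(toy, aes(log(eno))) +
  geom_histogram(bins = 5, na.rm = TRUE)
print(p)

# The bin counts add up to the number of finite values
binned <- layer_data(p, 1)
sum(binned$count)
```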

Density Smooth Plots

qplot(log(eno), data = maacs, geom = "density")

Density Smooth Plots w/ Color

qplot(log(eno), data = maacs, geom = "density", color = mopos)

Scatter plot1: eNO vs. PM25

qplot(log(pm25), log(eno), data = maacs)

Scatter plot2: eNO vs. PM25

qplot(log(pm25), log(eno), data = maacs, shape = mopos)

Scatter plot3: eNO vs. PM25

qplot(log(pm25), log(eno), data = maacs, color = mopos)
## Warning: Removed 184 rows containing missing values (geom_point).

Scatter plots w Smoothing and Linear Modeling

qplot(log(pm25), log(eno), data = maacs, color = mopos) +
      geom_smooth(method = "lm")

Splitting data with facets

qplot(log(pm25), 
      log(eno), 
      data = maacs, 
      color = mopos,
      facets = . ~ mopos) +
      geom_smooth(method = "lm")

NOTE: With qplot() it is difficult to go against the grain or customize (so don’t bother); for more complicated plots, learn and use the full ggplot2 functions instead.

ggplot2 (part 3)

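Fundamentals only: plots are built up in layers

  • data frame
  • aesthetic mappings: how data are mapped to color, size
  • geoms: geometric objects such as points, lines and shapes
  • facets: for conditional plots
  • stats: statistical transformations, e.g. binning, quantiles, smoothing
  • scales: the scale an aesthetic mapping uses, e.g. which color stands for each level of a factor such as m/f
  • coordinate system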

Basic plotting

qplot(log(pm25), 
      NocturnalSympt, 
      data = maacs, 
      facets = . ~ bmicat, 
      geom = "point") +
      geom_smooth(method = "lm") # set the method on the layer; recent ggplot2 releases ignore method = "lm" passed to qplot()

Building up layers

head(maacs[1:3,1:3])
##   id eno duBedMusM
## 1  1 141      2423
## 2  2 124      2793
## 3  3 126      3055
g <- ggplot(maacs, aes(log(pm25), NocturnalSympt))
print(g)

summary(g)
## data: id, eno, duBedMusM, pm25, mopos, logpm25, NocturnalSympt, bmicat,
##   logno2_new [750x9]
## mapping:  x = ~log(pm25), y = ~NocturnalSympt
## faceting: <ggproto object: Class FacetNull, Facet, gg>
##     compute_layout: function
##     draw_back: function
##     draw_front: function
##     draw_labels: function
##     draw_panels: function
##     finish_data: function
##     init_scales: function
##     map_data: function
##     params: list
##     setup_data: function
##     setup_params: function
##     shrink: TRUE
##     train_scales: function
##     vars: function
##     super:  <ggproto object: Class FacetNull, Facet, gg>
p <- g + geom_point()
print(p)

OR

g + geom_point()

ggplot2 (part 4)

Adding More Layers: Smoothing

g + geom_point() +
    geom_smooth()

g + geom_point() +
    geom_smooth(method = "lm")

Adding Facets - Conditional Vars

g + geom_point() +
    facet_grid(. ~ bmicat) +
    geom_smooth(method = "lm")

Annotations

  • xlab(), ylab(), labs(), ggtitle()
  • each geom function has its own options too
  • For global settings use theme()
    • e.g. theme(legend.position = "none")
  • Two std themes include
    • theme_gray()
    • theme_bw()
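A short sketch that puts these annotation functions together, using the mpg data from ggplot2 rather than maacs so that it runs on its own. The label text and the name p are illustrative choices. Note that theme() is added after theme_bw(), since a complete theme added later would reset the legend setting.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = drv), alpha = 0.5) +  # geom-level options
  xlab("Engine displacement (litres)") +
  ylab("Highway miles per gallon") +
  ggtitle("Fuel economy in the mpg data") +
  theme_bw() +                                  # one of the standard themes
  theme(legend.position = "none")               # global tweak: hide the legend
print(p)
```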

Modifying Aesthetics (colors)

g + geom_point(color = "lightblue", size = 4, alpha = 0.75) +
    geom_smooth(method = "lm")

g + geom_point(aes(color = bmicat), size = 4, alpha = 0.5) +
    geom_smooth(method = "lm")

Modifying Labels and Smoothing

g + geom_point(aes(color = bmicat), size = 3, alpha = 0.5) +
    labs(title = "MAACS Study") +
    labs(x = "log PM25",
         y = "Nocturnal Symptoms") +
    geom_smooth(method = "lm", se = TRUE)

Changing Themes

g + geom_point(aes(color = bmicat), size = 4, alpha = 0.5) +
    theme_bw(base_family = "Times") +
    geom_smooth(method = "lm")

ggplot2 (part 5)

Notes about the Axis Limits

Base Plot

testdata <- data.frame(x = 1:100, y = rnorm(100))
testdata[50,2] <- 100 # Giant Outlier
plot(testdata$x, testdata$y, type = "l")

GGPlot2

g <- ggplot(testdata, aes(x=x, y=y)) +
     geom_line()
print(g)

WRONG Way

This subsets the data to include ONLY the points within the limits, so the outlier at testdata[50,2] is dropped from the plot.

g + ylim(-3, 3)

Better Way

This keeps every point, including the outlier; the line simply runs outside the visible chart limits.

g + coord_cartesian(ylim = c(-3, 3))
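The difference between the two approaches can be checked without looking at the picture. In this sketch (a seed is added for reproducibility; ld_ylim and ld_cart are hypothetical names) layer_data() returns the data each plot actually works with: ylim() replaces out-of-range values with NA before drawing, while coord_cartesian() leaves the data alone and only zooms the view.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added so the sketch is reproducible
testdata <- data.frame(x = 1:100, y = rnorm(100))
testdata[50, 2] <- 100  # giant outlier

g <- ggplot(testdata, aes(x = x, y = y)) +
  geom_line()

# Data as seen by the line layer under each approach
ld_ylim <- layer_data(g + ylim(-3, 3))
ld_cart <- layer_data(g + coord_cartesian(ylim = c(-3, 3)))

sum(is.na(ld_ylim$y))  # values censored by the scale limits
max(ld_cart$y)         # the outlier is still present in the data
```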

g <- ggplot(mpg, aes(x = displ, y = hwy, color = factor(year)))
g + geom_point() +
    facet_grid(drv ~ cyl, margins = TRUE)

g + geom_point() +
    facet_grid(drv ~ cyl, margins = TRUE) +
    geom_smooth(method = "lm", se = FALSE, size = 2, color = "black")
## `geom_smooth()` using formula 'y ~ x'

g + geom_point() +
    facet_grid(drv ~ cyl, margins = TRUE) +
    geom_smooth(method = "lm", se = FALSE, size = 2, color = "black") +
    labs(x = "Displacement", y = "Highway Mileage", title = "Swirl Rules!")
## `geom_smooth()` using formula 'y ~ x'

Another Example by R. Peng

Cutting NO2 into ranges

# Calculate the tertile cutpoints of the data (4 cutpoints give 3 ranges)
cutpoints <- quantile(maacs$logno2_new, seq(0, 1, length = 4), na.rm = TRUE)

# Cut the data at the tertiles (the variable name still says decile)
maacs$no_decile <- cut(maacs$logno2_new, cutpoints)

# See the levels of the newly created factor var
levels(maacs$no_decile)
## [1] "(-0.629,1.18]" "(1.18,1.44]"   "(1.44,2.48]"
# Setup ggplot with data frame
g <- ggplot(maacs, aes(log(pm25), NocturnalSympt))

# Adding Layers
g + geom_point(alpha = 1/3) +
    facet_wrap(bmicat ~ no_decile, nrow = 2, ncol = 4) +
    geom_smooth(method = "lm", se = FALSE, col = 2) +
    theme_bw(base_family = "Times", base_size = 12) +
    labs(x = "log PM25") +
    labs(y = "Nocturnal Symptoms") +
    labs(title = "MAACS Study")
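A self-contained sketch of the cutting step, assuming simulated values (v is a hypothetical stand-in for maacs$logno2_new, which is not loaded here). It also shows one detail of cut(): its intervals are open on the left by default, so the minimum value equals the lowest cutpoint and becomes NA unless include.lowest = TRUE is set.

```r
set.seed(1)
v <- rnorm(300)  # assumption: simulated stand-in for the real variable

# Four cutpoints (0%, 33%, 67%, 100%) define three ranges
cutpoints <- quantile(v, seq(0, 1, length = 4), na.rm = TRUE)

# Default behavior: the minimum falls outside the first interval
default_cut <- cut(v, cutpoints)
sum(is.na(default_cut))

# include.lowest = TRUE closes the first interval on the left
tertile <- cut(v, cutpoints, include.lowest = TRUE)
table(tertile)
```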

Practical R Exercises in swirl

Practical R Exercises in swirl Part 2

- Duration: 10 minutes
  1. swirl Lesson 1: Lattice Plotting System
    • Duration: ~3 hours
  2. swirl Lesson 2: Working with Colors
    • Duration: ~3 hours
  3. swirl Lesson 3: GGPlot2 Part1
    • Duration: ~3 hours
  4. swirl Lesson 4: GGPlot2 Part2
    • Duration: ~3 hours
  5. swirl Lesson 5: GGPlot2 Extras
    • Duration: ~3 hours

EDA Lesson 6: Lattice Plotting System

library(lattice)
library(ggplot2)

Example: xyplot(y ~ x | f * g, data)

head(airquality)
##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3
## 4    18     313 11.5   62     5   4
## 5    NA      NA 14.3   56     5   5
## 6    28      NA 14.9   66     5   6
xyplot(Ozone ~ Wind, data = airquality)

xyplot(Ozone ~ Wind, data = airquality, col = "red", pch = 8, main = "Big Apple Data")

xyplot(Ozone ~ Wind | as.factor(Month), data = airquality, layout = c(5,1))

xyplot(Ozone ~ Wind | Month, data = airquality, layout = c(5,1))

p <- xyplot(Ozone~Wind,data=airquality)
print(p)

names(p) # 45 named properties
##  [1] "formula"           "as.table"          "aspect.fill"      
##  [4] "legend"            "panel"             "page"             
##  [7] "layout"            "skip"              "strip"            
## [10] "strip.left"        "xscale.components" "yscale.components"
## [13] "axis"              "xlab"              "ylab"             
## [16] "xlab.default"      "ylab.default"      "xlab.top"         
## [19] "ylab.right"        "main"              "sub"              
## [22] "x.between"         "y.between"         "par.settings"     
## [25] "plot.args"         "lattice.options"   "par.strip.text"   
## [28] "index.cond"        "perm.cond"         "condlevels"       
## [31] "call"              "x.scales"          "y.scales"         
## [34] "panel.args.common" "panel.args"        "packet.sizes"     
## [37] "x.limits"          "y.limits"          "x.used.at"        
## [40] "y.used.at"         "x.num.limit"       "y.num.limit"      
## [43] "aspect.ratio"      "prepanel.default"  "prepanel"
p[["formula"]]
## Ozone ~ Wind
p[["x.limits"]]
## [1]  0.37 22.03
table(f)
## f
## Grupo1 Grupo2 
##     50     50
xyplot(y~x|f, layout = c(2,1))

The panel function has 3 arguments: x, y and ... (the dots pass any further arguments through to the default panel function).

p <- xyplot(y ~ x | f, panel = function(x, y, ...) {
  panel.xyplot(x, y, ...)  ## First call the default panel function for 'xyplot'
  panel.abline(h = median(y), lty = 2)  ## Add a horizontal line at the median
})
print(p)


Diamonds

str(diamonds)
## tibble [53,940 × 10] (S3: tbl_df/tbl/data.frame)
##  $ carat  : num [1:53940] 0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
##  $ cut    : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
##  $ color  : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
##  $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
##  $ depth  : num [1:53940] 61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
##  $ table  : num [1:53940] 55 61 65 58 58 57 57 55 61 61 ...
##  $ price  : int [1:53940] 326 326 327 334 335 336 336 337 337 338 ...
##  $ x      : num [1:53940] 3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
##  $ y      : num [1:53940] 3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
##  $ z      : num [1:53940] 2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...
table(diamonds$color)
## 
##     D     E     F     G     H     I     J 
##  6775  9797  9542 11292  8304  5422  2808
table(diamonds$color, diamonds$cut)
##    
##     Fair Good Very Good Premium Ideal
##   D  163  662      1513    1603  2834
##   E  224  933      2400    2337  3903
##   F  312  909      2164    2331  3826
##   G  314  871      2299    2924  4884
##   H  303  702      1824    2360  3115
##   I  175  522      1204    1428  2093
##   J  119  307       678     808   896

Review

True or False? Lattice plots are constructed by a series of calls to core functions.

1: True 2: False

Selection: 2

Excellent work!

True or False? Lattice plots are constructed with a single function call to a core lattice function (e.g. xyplot)

1: True 2: False

Selection: 1

All that hard work is paying off!

True or False? Aspects like margins and spacing are automatically handled and defaults are usually sufficient.

1: False 2: True

Selection: 2

Perseverance, that’s the answer.

True or False? The lattice system is ideal for creating conditioning plots where you examine the same kind of plot under many different conditions.

1: True 2: False

Selection: 1

You are amazing!

True or False? The lattice system, like the base plotting system, returns a trellis plot object.

1: True 2: False

Selection: 2

That’s a job well done!

True or False? Panel functions can NEVER be customized to modify what is plotted in each of the plot panels.

1: False 2: True

Selection: 1

You nailed it! Good job!

True or False? Lattice plots can display at most 20 panels in a single plot.

1: False 2: True

Selection: 1

EDA Lesson 2: Working with Colors


library(jpeg)
library(RColorBrewer)
library(datasets)

colors()
##   [1] "white"                "aliceblue"            "antiquewhite"        
##   [4] "antiquewhite1"        "antiquewhite2"        "antiquewhite3"       
##   [7] "antiquewhite4"        "aquamarine"           "aquamarine1"         
##  [10] "aquamarine2"          "aquamarine3"          "aquamarine4"         
##  [13] "azure"                "azure1"               "azure2"              
##  [16] "azure3"               "azure4"               "beige"               
##  [19] "bisque"               "bisque1"              "bisque2"             
##  [22] "bisque3"              "bisque4"              "black"               
##  [25] "blanchedalmond"       "blue"                 "blue1"               
##  [28] "blue2"                "blue3"                "blue4"               
##  [31] "blueviolet"           "brown"                "brown1"              
##  [34] "brown2"               "brown3"               "brown4"              
##  [37] "burlywood"            "burlywood1"           "burlywood2"          
##  [40] "burlywood3"           "burlywood4"           "cadetblue"           
##  [43] "cadetblue1"           "cadetblue2"           "cadetblue3"          
##  [46] "cadetblue4"           "chartreuse"           "chartreuse1"         
##  [49] "chartreuse2"          "chartreuse3"          "chartreuse4"         
##  [52] "chocolate"            "chocolate1"           "chocolate2"          
##  [55] "chocolate3"           "chocolate4"           "coral"               
##  [58] "coral1"               "coral2"               "coral3"              
##  [61] "coral4"               "cornflowerblue"       "cornsilk"            
##  [64] "cornsilk1"            "cornsilk2"            "cornsilk3"           
##  [67] "cornsilk4"            "cyan"                 "cyan1"               
##  [70] "cyan2"                "cyan3"                "cyan4"               
##  [73] "darkblue"             "darkcyan"             "darkgoldenrod"       
##  [76] "darkgoldenrod1"       "darkgoldenrod2"       "darkgoldenrod3"      
##  [79] "darkgoldenrod4"       "darkgray"             "darkgreen"           
##  [82] "darkgrey"             "darkkhaki"            "darkmagenta"         
##  [85] "darkolivegreen"       "darkolivegreen1"      "darkolivegreen2"     
##  [88] "darkolivegreen3"      "darkolivegreen4"      "darkorange"          
##  [91] "darkorange1"          "darkorange2"          "darkorange3"         
##  [94] "darkorange4"          "darkorchid"           "darkorchid1"         
##  [97] "darkorchid2"          "darkorchid3"          "darkorchid4"         
## [100] "darkred"              "darksalmon"           "darkseagreen"        
## [103] "darkseagreen1"        "darkseagreen2"        "darkseagreen3"       
## [106] "darkseagreen4"        "darkslateblue"        "darkslategray"       
## [109] "darkslategray1"       "darkslategray2"       "darkslategray3"      
## [112] "darkslategray4"       "darkslategrey"        "darkturquoise"       
## [115] "darkviolet"           "deeppink"             "deeppink1"           
## [118] "deeppink2"            "deeppink3"            "deeppink4"           
## [121] "deepskyblue"          "deepskyblue1"         "deepskyblue2"        
## [124] "deepskyblue3"         "deepskyblue4"         "dimgray"             
## [127] "dimgrey"              "dodgerblue"           "dodgerblue1"         
## [130] "dodgerblue2"          "dodgerblue3"          "dodgerblue4"         
## [133] "firebrick"            "firebrick1"           "firebrick2"          
## [136] "firebrick3"           "firebrick4"           "floralwhite"         
## [139] "forestgreen"          "gainsboro"            "ghostwhite"          
## [142] "gold"                 "gold1"                "gold2"               
## [145] "gold3"                "gold4"                "goldenrod"           
## [148] "goldenrod1"           "goldenrod2"           "goldenrod3"          
## [151] "goldenrod4"           "gray"                 "gray0"               
## [154] "gray1"                "gray2"                "gray3"               
## [157] "gray4"                "gray5"                "gray6"               
## [160] "gray7"                "gray8"                "gray9"               
## [163] "gray10"               "gray11"               "gray12"              
## [166] "gray13"               "gray14"               "gray15"              
## [169] "gray16"               "gray17"               "gray18"              
## [172] "gray19"               "gray20"               "gray21"              
## [175] "gray22"               "gray23"               "gray24"              
## [178] "gray25"               "gray26"               "gray27"              
## [181] "gray28"               "gray29"               "gray30"              
## [184] "gray31"               "gray32"               "gray33"              
## [187] "gray34"               "gray35"               "gray36"              
## [190] "gray37"               "gray38"               "gray39"              
## [193] "gray40"               "gray41"               "gray42"              
## [196] "gray43"               "gray44"               "gray45"              
## [199] "gray46"               "gray47"               "gray48"              
## [202] "gray49"               "gray50"               "gray51"              
## [205] "gray52"               "gray53"               "gray54"              
## [208] "gray55"               "gray56"               "gray57"              
## [211] "gray58"               "gray59"               "gray60"              
## [214] "gray61"               "gray62"               "gray63"              
## [217] "gray64"               "gray65"               "gray66"              
## [220] "gray67"               "gray68"               "gray69"              
## [223] "gray70"               "gray71"               "gray72"              
## [226] "gray73"               "gray74"               "gray75"              
## [229] "gray76"               "gray77"               "gray78"              
## [232] "gray79"               "gray80"               "gray81"              
## [235] "gray82"               "gray83"               "gray84"              
## [238] "gray85"               "gray86"               "gray87"              
## [241] "gray88"               "gray89"               "gray90"              
## [244] "gray91"               "gray92"               "gray93"              
## [247] "gray94"               "gray95"               "gray96"              
## [250] "gray97"               "gray98"               "gray99"              
## [253] "gray100"              "green"                "green1"              
## [256] "green2"               "green3"               "green4"              
## [259] "greenyellow"          "grey"                 "grey0"               
## [262] "grey1"                "grey2"                "grey3"               
## [265] "grey4"                "grey5"                "grey6"               
## [268] "grey7"                "grey8"                "grey9"               
## [271] "grey10"               "grey11"               "grey12"              
## [274] "grey13"               "grey14"               "grey15"              
## [277] "grey16"               "grey17"               "grey18"              
## [280] "grey19"               "grey20"               "grey21"              
## [283] "grey22"               "grey23"               "grey24"              
## [286] "grey25"               "grey26"               "grey27"              
## [289] "grey28"               "grey29"               "grey30"              
## [292] "grey31"               "grey32"               "grey33"              
## [295] "grey34"               "grey35"               "grey36"              
## [298] "grey37"               "grey38"               "grey39"              
## [301] "grey40"               "grey41"               "grey42"              
## [304] "grey43"               "grey44"               "grey45"              
## [307] "grey46"               "grey47"               "grey48"              
## [310] "grey49"               "grey50"               "grey51"              
## [313] "grey52"               "grey53"               "grey54"              
## [316] "grey55"               "grey56"               "grey57"              
## [319] "grey58"               "grey59"               "grey60"              
## [322] "grey61"               "grey62"               "grey63"              
## [325] "grey64"               "grey65"               "grey66"              
## [328] "grey67"               "grey68"               "grey69"              
## [331] "grey70"               "grey71"               "grey72"              
## [334] "grey73"               "grey74"               "grey75"              
## [337] "grey76"               "grey77"               "grey78"              
## [340] "grey79"               "grey80"               "grey81"              
## [343] "grey82"               "grey83"               "grey84"              
## [346] "grey85"               "grey86"               "grey87"              
## [349] "grey88"               "grey89"               "grey90"              
## [352] "grey91"               "grey92"               "grey93"              
## [355] "grey94"               "grey95"               "grey96"              
## [358] "grey97"               "grey98"               "grey99"              
## [361] "grey100"              "honeydew"             "honeydew1"           
## [364] "honeydew2"            "honeydew3"            "honeydew4"           
## [367] "hotpink"              "hotpink1"             "hotpink2"            
## [370] "hotpink3"             "hotpink4"             "indianred"           
## [373] "indianred1"           "indianred2"           "indianred3"          
## [376] "indianred4"           "ivory"                "ivory1"              
## [379] "ivory2"               "ivory3"               "ivory4"              
## [382] "khaki"                "khaki1"               "khaki2"              
## [385] "khaki3"               "khaki4"               "lavender"            
## [388] "lavenderblush"        "lavenderblush1"       "lavenderblush2"      
## [391] "lavenderblush3"       "lavenderblush4"       "lawngreen"           
## [394] "lemonchiffon"         "lemonchiffon1"        "lemonchiffon2"       
## [397] "lemonchiffon3"        "lemonchiffon4"        "lightblue"           
## [400] "lightblue1"           "lightblue2"           "lightblue3"          
## [403] "lightblue4"           "lightcoral"           "lightcyan"           
## [406] "lightcyan1"           "lightcyan2"           "lightcyan3"          
## [409] "lightcyan4"           "lightgoldenrod"       "lightgoldenrod1"     
## [412] "lightgoldenrod2"      "lightgoldenrod3"      "lightgoldenrod4"     
## [415] "lightgoldenrodyellow" "lightgray"            "lightgreen"          
## [418] "lightgrey"            "lightpink"            "lightpink1"          
## [421] "lightpink2"           "lightpink3"           "lightpink4"          
## [424] "lightsalmon"          "lightsalmon1"         "lightsalmon2"        
## [427] "lightsalmon3"         "lightsalmon4"         "lightseagreen"       
## [430] "lightskyblue"         "lightskyblue1"        "lightskyblue2"       
## [433] "lightskyblue3"        "lightskyblue4"        "lightslateblue"      
## [436] "lightslategray"       "lightslategrey"       "lightsteelblue"      
## [439] "lightsteelblue1"      "lightsteelblue2"      "lightsteelblue3"     
## [442] "lightsteelblue4"      "lightyellow"          "lightyellow1"        
## [445] "lightyellow2"         "lightyellow3"         "lightyellow4"        
## [448] "limegreen"            "linen"                "magenta"             
## [451] "magenta1"             "magenta2"             "magenta3"            
## [454] "magenta4"             "maroon"               "maroon1"             
## [457] "maroon2"              "maroon3"              "maroon4"             
## [460] "mediumaquamarine"     "mediumblue"           "mediumorchid"        
## [463] "mediumorchid1"        "mediumorchid2"        "mediumorchid3"       
## [466] "mediumorchid4"        "mediumpurple"         "mediumpurple1"       
## [469] "mediumpurple2"        "mediumpurple3"        "mediumpurple4"       
## [472] "mediumseagreen"       "mediumslateblue"      "mediumspringgreen"   
## [475] "mediumturquoise"      "mediumvioletred"      "midnightblue"        
## [478] "mintcream"            "mistyrose"            "mistyrose1"          
## [481] "mistyrose2"           "mistyrose3"           "mistyrose4"          
## [484] "moccasin"             "navajowhite"          "navajowhite1"        
## [487] "navajowhite2"         "navajowhite3"         "navajowhite4"        
## [490] "navy"                 "navyblue"             "oldlace"             
## [493] "olivedrab"            "olivedrab1"           "olivedrab2"          
## [496] "olivedrab3"           "olivedrab4"           "orange"              
## [499] "orange1"              "orange2"              "orange3"             
## [502] "orange4"              "orangered"            "orangered1"          
## [505] "orangered2"           "orangered3"           "orangered4"          
## [508] "orchid"               "orchid1"              "orchid2"             
## [511] "orchid3"              "orchid4"              "palegoldenrod"       
## [514] "palegreen"            "palegreen1"           "palegreen2"          
## [517] "palegreen3"           "palegreen4"           "paleturquoise"       
## [520] "paleturquoise1"       "paleturquoise2"       "paleturquoise3"      
## [523] "paleturquoise4"       "palevioletred"        "palevioletred1"      
## [526] "palevioletred2"       "palevioletred3"       "palevioletred4"      
## [529] "papayawhip"           "peachpuff"            "peachpuff1"          
## [532] "peachpuff2"           "peachpuff3"           "peachpuff4"          
## [535] "peru"                 "pink"                 "pink1"               
## [538] "pink2"                "pink3"                "pink4"               
## [541] "plum"                 "plum1"                "plum2"               
## [544] "plum3"                "plum4"                "powderblue"          
## [547] "purple"               "purple1"              "purple2"             
## [550] "purple3"              "purple4"              "red"                 
## [553] "red1"                 "red2"                 "red3"                
## [556] "red4"                 "rosybrown"            "rosybrown1"          
## [559] "rosybrown2"           "rosybrown3"           "rosybrown4"          
## [562] "royalblue"            "royalblue1"           "royalblue2"          
## [565] "royalblue3"           "royalblue4"           "saddlebrown"         
## [568] "salmon"               "salmon1"              "salmon2"             
## [571] "salmon3"              "salmon4"              "sandybrown"          
## [574] "seagreen"             "seagreen1"            "seagreen2"           
## [577] "seagreen3"            "seagreen4"            "seashell"            
## [580] "seashell1"            "seashell2"            "seashell3"           
## [583] "seashell4"            "sienna"               "sienna1"             
## [586] "sienna2"              "sienna3"              "sienna4"             
## [589] "skyblue"              "skyblue1"             "skyblue2"            
## [592] "skyblue3"             "skyblue4"             "slateblue"           
## [595] "slateblue1"           "slateblue2"           "slateblue3"          
## [598] "slateblue4"           "slategray"            "slategray1"          
## [601] "slategray2"           "slategray3"           "slategray4"          
## [604] "slategrey"            "snow"                 "snow1"               
## [607] "snow2"                "snow3"                "snow4"               
## [610] "springgreen"          "springgreen1"         "springgreen2"        
## [613] "springgreen3"         "springgreen4"         "steelblue"           
## [616] "steelblue1"           "steelblue2"           "steelblue3"          
## [619] "steelblue4"           "tan"                  "tan1"                
## [622] "tan2"                 "tan3"                 "tan4"                
## [625] "thistle"              "thistle1"             "thistle2"            
## [628] "thistle3"             "thistle4"             "tomato"              
## [631] "tomato1"              "tomato2"              "tomato3"             
## [634] "tomato4"              "turquoise"            "turquoise1"          
## [637] "turquoise2"           "turquoise3"           "turquoise4"          
## [640] "violet"               "violetred"            "violetred1"          
## [643] "violetred2"           "violetred3"           "violetred4"          
## [646] "wheat"                "wheat1"               "wheat2"              
## [649] "wheat3"               "wheat4"               "whitesmoke"          
## [652] "yellow"               "yellow1"              "yellow2"             
## [655] "yellow3"              "yellow4"              "yellowgreen"
sample(colors(),10)
##  [1] "brown2"        "darkseagreen2" "beige"         "mistyrose4"   
##  [5] "mediumorchid3" "navajowhite2"  "purple4"       "royalblue"    
##  [9] "grey80"        "grey85"
pal <- colorRamp(c("red","blue"))
pal(0)
##      [,1] [,2] [,3]
## [1,]  255    0    0
pal(1)
##      [,1] [,2] [,3]
## [1,]    0    0  255
pal(seq(0,1,len=6))
##      [,1] [,2] [,3]
## [1,]  255    0    0
## [2,]  204    0   51
## [3,]  153    0  102
## [4,]  102    0  153
## [5,]   51    0  204
## [6,]    0    0  255
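
The rows above are a straight linear blend of the two endpoint colors. A minimal sketch that reproduces them by hand, assuming the default linear interpolation in RGB space; `endpoints`, `pos` and `manual` are illustrative names, not part of the lesson:

```r
# Endpoint colors as RGB rows (0-255); col2rgb() returns one column per color
endpoints <- t(col2rgb(c("red", "blue")))

# Positions along the ramp, as in pal(seq(0, 1, len = 6))
pos <- seq(0, 1, length.out = 6)

# Linear blend: (1 - pos) * red + pos * blue, one row per position
manual <- outer(1 - pos, endpoints[1, ]) + outer(pos, endpoints[2, ])

# Compare with what colorRamp() computes
pal <- colorRamp(c("red", "blue"))
all.equal(unname(manual), pal(pos))
```

So `pal(0.2)` is 80% red plus 20% blue, which gives the second row (204, 0, 51).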
p1 <- colorRampPalette(c("red","blue"))
p1(2)
## [1] "#FF0000" "#0000FF"
p1(6)
## [1] "#FF0000" "#CC0033" "#990066" "#650099" "#3200CC" "#0000FF"
0xcc
## [1] 204
#hex 33 to decimal, as in 0x33=3*16+3=51
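
The hex arithmetic in the comment above can also be checked from R. A short sketch using only base functions:

```r
# Hex string -> decimal, base 16
strtoi("CC", base = 16L)   # 12 * 16 + 12 = 204
strtoi("33", base = 16L)   # 3 * 16 + 3 = 51

# Decimal -> two-digit uppercase hex
sprintf("%02X", 204L)      # "CC"

# col2rgb() decodes a full "#RRGGBB" string into red, green, blue rows
col2rgb("#CC0033")
```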
p2 <- colorRampPalette(c("red","yellow"))
p2(2)
## [1] "#FF0000" "#FFFF00"
p2(10)
##  [1] "#FF0000" "#FF1C00" "#FF3800" "#FF5500" "#FF7100" "#FF8D00" "#FFAA00"
##  [8] "#FFC600" "#FFE200" "#FFFF00"
p1
## function (n) 
## {
##     x <- ramp(seq.int(0, 1, length.out = n))
##     if (ncol(x) == 4L) 
##         rgb(x[, 1L], x[, 2L], x[, 3L], x[, 4L], maxColorValue = 255)
##     else rgb(x[, 1L], x[, 2L], x[, 3L], maxColorValue = 255)
## }
## <bytecode: 0x55a97d7d5170>
## <environment: 0x55a97d73e340>

rgb {grDevices} - R Documentation - RGB Color Specification

?rgb

rgb((0:15)/15, green = 0, blue = 0, names = paste("red", 0:15, sep = "."))

rgb(0, 0:12, 0, max = 255) # integer input

ramp <- colorRamp(c("red", "white"))
rgb(ramp(seq(0, 1, length = 5)), max = 255)

p3 <- colorRampPalette(c("blue", "green"), alpha = 0.5)
p3(5)
## [1] "#0000FFFF" "#003FBFFF" "#007F7FFF" "#00BF3FFF" "#00FF00FF"
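
Every color above ends in `FF` (fully opaque) even though `alpha = 0.5` was passed. According to `?colorRamp`, `alpha` is a logical flag that says whether an alpha channel should be returned, so 0.5 is simply treated as TRUE. A hedged sketch of two ways to get semi-transparent ramp colors; `base_cols`, `p_alpha` and `half` are illustrative names:

```r
# Option 1: give the endpoint colors their own alpha and ask the ramp to keep it
base_cols <- c(rgb(0, 0, 1, 0.5), rgb(0, 1, 0, 0.5))
p_alpha <- colorRampPalette(base_cols, alpha = TRUE)
p_alpha(5)

# Option 2: build an opaque ramp, then scale the alpha channel afterwards
half <- adjustcolor(colorRampPalette(c("blue", "green"))(5), alpha.f = 0.5)
half

# Inspect the alpha channel (0-255) of the result
col2rgb(half, alpha = TRUE)["alpha", ]
```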
# Package ‘RColorBrewer’ loaded correctly!
plot(x, y, pch = 19, col = rgb(0,0.5,0.5))

Alpha = 0.3 (30%)

plot(x, y, pch = 19, col = rgb(0,0.5,0.5, 0.3))
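
The vectors `x` and `y` were preloaded by the swirl session and are not defined in these notes. A self-contained sketch of the same comparison, assuming random normal draws (the seed and sample size are arbitrary choices, not from the lesson):

```r
set.seed(1)                 # arbitrary seed, for reproducibility only
x <- rnorm(2000)
y <- rnorm(2000)

# Side by side: opaque points vs. points drawn at 30% alpha
op <- par(mfrow = c(1, 2))
plot(x, y, pch = 19, col = rgb(0, 0.5, 0.5), main = "opaque")
plot(x, y, pch = 19, col = rgb(0, 0.5, 0.5, 0.3), main = "alpha = 0.3")
par(op)                     # restore the previous layout
```

With transparency, overlapping points build up darker color, so the dense center of the cloud becomes visible.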

Our last topic for this lesson is the RColorBrewer package, available on CRAN, which provides useful color palettes of three types: sequential, diverging, and qualitative. Which type to use depends on your data: sequential for ordered values running from low to high, diverging for values that deviate in two directions from a midpoint, and qualitative for unordered categories.
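
A small sketch of one palette from each type. It assumes RColorBrewer is installed; the palette names are standard ColorBrewer names, while the variable names are only illustrative:

```r
library(RColorBrewer)   # assumes the package is installed from CRAN

seq_cols  <- brewer.pal(5, "BuGn")   # sequential: ordered low -> high
div_cols  <- brewer.pal(5, "RdBu")   # diverging: deviations around a midpoint
qual_cols <- brewer.pal(5, "Set1")   # qualitative: unordered categories

# brewer.pal.info lists every palette with its category ("seq", "div", "qual")
brewer.pal.info[c("BuGn", "RdBu", "Set1"), ]
```

`display.brewer.all()` draws all available palettes grouped by type, which helps when choosing one.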

showMe <- function(cv){
  myarg <- deparse(substitute(cv))
  z<- outer( 1:20,1:20, "+")
  obj<- list( x=1:20,y=1:20,z=z )
  image(obj, col=cv, main=myarg  )
}
library(RColorBrewer) # provides brewer.pal()
cols <- brewer.pal(3, "BuGn")

pal <- colorRampPalette(cols)

showMe(pal(20))

image(volcano, col = pal(20))

Which of the following is an R package that provides color palettes for sequential, categorical, and diverging data?

1: RColorVintner 2: RColorBrewer 3: RColorBluer 4: RColorStewer

Selection: 2

Perseverance, that’s the answer.
True or False? The colorRamp and colorRampPalette functions can be used in conjunction with
color palettes to connect data to colors.

1: False 2: True

Selection: 2

That’s correct!
True or False? Transparency can NEVER be used to clarify plots with many points

1: True 2: False

Selection: 2

You are doing so well!
True or False? The call p7 <- colorRamp("red","blue") would work (i.e., not generate an
error).

1: False 2: True

Selection: 2

One more time. You can do it!
Recall our reminders to concatenate the colors to form a single argument.

1: False 2: True

Selection: 1

That’s a job well done!
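
The reason the call fails can be checked directly. In this sketch `tryCatch` captures the error instead of stopping the script; `res` is an illustrative name:

```r
# Two separate arguments: "blue" is matched to the second parameter (bias),
# not treated as a second color
res <- tryCatch(colorRamp("red", "blue"),
                error = function(e) conditionMessage(e))
res

# Correct: concatenate the colors into a single vector argument
p7 <- colorRamp(c("red", "blue"))
p7(0.5)   # midpoint between red and blue
```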
True or False? The function colors returns only 10 colors.

1: False 2: True

Selection: 1

You’re the best!
Transparency is determined by which parameter of the rgb function?

1: beta 2: delta 3: alpha 4: it’s all Greek to me 5: gamma

Selection: 3

That’s correct!

Lesson 3: GGPlot2 Part1

GGPlot2_Part1. (Slides for this and other Data Science courses may be found at github

See: http://ggplot2.org

str(mpg)
## tibble [234 × 11] (S3: tbl_df/tbl/data.frame)
##  $ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ...
##  $ model       : chr [1:234] "a4" "a4" "a4" "a4" ...
##  $ displ       : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
##  $ year        : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
##  $ cyl         : int [1:234] 4 4 4 4 6 6 6 4 4 4 ...
##  $ trans       : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
##  $ drv         : chr [1:234] "f" "f" "f" "f" ...
##  $ cty         : int [1:234] 18 21 20 21 16 18 18 18 16 20 ...
##  $ hwy         : int [1:234] 29 29 31 30 26 26 27 26 25 28 ...
##  $ fl          : chr [1:234] "p" "p" "p" "p" ...
##  $ class       : chr [1:234] "compact" "compact" "compact" "compact" ...
qplot(displ, hwy, data=mpg)

qplot(displ, hwy, data=mpg) + aes(color = drv)

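# OR, equivalently, passing color directly to qplot:

qplot(displ, hwy, data = mpg, color = drv)

# Not equivalent: with only y supplied, hwy is plotted against its row index

qplot(y=hwy, data = mpg, color = drv)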

qplot(displ, hwy, data = mpg, color = drv, geom = c("point", "smooth"))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Box plots with GGplot2

qplot(drv, hwy, data=mpg, geom="boxplot")

Cool

qplot(drv,hwy,data=mpg,geom="boxplot",color=manufacturer)

qplot(hwy, data = mpg, fill = drv)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

qplot(displ, hwy, data=mpg,facets = .~drv)

qplot(hwy, data=mpg,facets = drv~., binwidth = 2)

The facets argument, drv ~ ., resulted in what arrangement of facets?

1: huh? 2: 2 by 2 3: 3 by 1 4: 1 by 3

Selection: 3

Keep working like that and you’ll get there!
Pretty good, right? Not too difficult either. Let’s review what we learned!
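
As a recap of the faceting formula used above: it reads `rows ~ columns`, with `.` meaning "nothing on this side". A sketch of the same two layouts written with `ggplot()` and `facet_grid()`, assuming ggplot2 is installed (`qplot()` is deprecated as of ggplot2 3.4.0, so this form may be the safer one to learn); `p_rows` and `p_cols` are illustrative names:

```r
library(ggplot2)

# drv on the left of ~ puts its 3 levels in rows (3 by 1)
p_rows <- ggplot(mpg, aes(hwy)) +
  geom_histogram(binwidth = 2) +
  facet_grid(drv ~ .)

# drv on the right of ~ puts its 3 levels in columns (1 by 3)
p_cols <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_grid(. ~ drv)

p_rows
p_cols
```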

Which of the following is a basic workhorse function of ggplot2?

1: xyplot 2: hist 3: scatterplot 4: gplot 5: qplot

Selection: 5

Your dedication is inspiring!
Which types of plot does qplot plot?

1: histograms 2: scatterplots 3: all of the others 4: box and whisker plots

Selection: 3

Excellent work!
What does the gg in ggplot2 stand for?

1: grammar of graphics 2: good grief 3: goto graphics 4: good graphics

Selection: 1

Keep working like that and you’ll get there!
True or False? The geom argument takes a string for a value.

1: False 2: True

Selection: 2

You’re the best!
True or False? The data argument takes a string for a value.

1: False 2: True

Selection: 1

Keep up the great work!
True or False? The binwidth argument takes a string for a value.

1: False 2: True

Selection: 1

Nice work!
True or False? The user must specify x- and y-axis labels when using qplot.

1: True 2: False

Selection: 2

GGPlot2 Part2

qplot(displ, hwy, data = mpg, geom=c("point", "smooth"),facets=.~drv)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

g <- ggplot(mpg, aes(displ,hwy) ) 
g

summary(g)
## data: manufacturer, model, displ, year, cyl, trans, drv, cty, hwy, fl,
##   class [234x11]
## mapping:  x = ~displ, y = ~hwy
## faceting: <ggproto object: Class FacetNull, Facet, gg>
##     compute_layout: function
##     draw_back: function
##     draw_front: function
##     draw_labels: function
##     draw_panels: function
##     finish_data: function
##     init_scales: function
##     map_data: function
##     params: list
##     setup_data: function
##     setup_params: function
##     shrink: TRUE
##     train_scales: function
##     vars: function
##     super:  <ggproto object: Class FacetNull, Facet, gg>
g+geom_point()
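
Printing `g` on its own shows only an empty panel because the object holds the data and the aesthetic mapping but no layers yet. A small sketch of how `+` adds layers. It inspects the `layers` element, which is an internal detail and may change between ggplot2 versions:

```r
library(ggplot2)

g <- ggplot(mpg, aes(displ, hwy))
length(g$layers)                       # no layers yet, so no points are drawn

p <- g + geom_point()                  # + returns a new plot object
length(p$layers)
length(g$layers)                       # g itself is unchanged

p2 <- p + geom_smooth(method = "lm")   # each + appends one more layer
length(p2$layers)
```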

(g + geom_point())+geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

g+geom_point()+geom_smooth(method="lm")
## `geom_smooth()` using formula 'y ~ x'

g+geom_point()+geom_smooth(method="lm")+facet_grid(.~drv)
## `geom_smooth()` using formula 'y ~ x'

g+geom_point(col="pink",size=4,alpha=1/2)

g+geom_point(size=4,alpha=1/2, aes(color=drv))
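
The two calls above differ in where the color is given. Outside `aes()` it is a constant setting; inside `aes()` it is a mapping from a data variable. A sketch of the distinction, including a common slip; the `p_*` names are illustrative:

```r
library(ggplot2)
g <- ggplot(mpg, aes(displ, hwy))

# Setting: a constant outside aes(); every point is pink, no legend
p_set <- g + geom_point(colour = "pink", size = 4, alpha = 1/2)

# Mapping: a variable inside aes(); one color per drv level, legend added
p_map <- g + geom_point(aes(colour = drv), size = 4, alpha = 1/2)

# Common slip: a constant inside aes() is treated as data, so the points
# get a default palette color and a legend entry labelled "pink"
p_slip <- g + geom_point(aes(colour = "pink"), size = 4, alpha = 1/2)
```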

g+geom_point(aes(color=drv))+labs(title="Swirl Rules!")+labs(x="Displacement", y="Hwy Mileage")

g + geom_point(aes(color = drv),size=2,alpha=1/2) +
    geom_smooth(size=4,linetype=3,method="lm",se=FALSE)
## `geom_smooth()` using formula 'y ~ x'

g + geom_point(aes(color = drv)) + theme_bw(base_family = "serif" )

GGPlot2 Extras

GGPlot2_Extras. (Slides for this and other Data Science courses may be found at github https://github.com/DataScienceSpecialization/courses/. If you care to use them, they must be downloaded as a zip file and viewed locally. This lesson corresponds to 04_ExploratoryAnalysis/ggplot2.)

head(diamonds)
## # A tibble: 6 x 10
##   carat cut       color clarity depth table price     x     y     z
##   <dbl> <ord>     <ord> <ord>   <dbl> <dbl> <int> <dbl> <dbl> <dbl>
## 1 0.23  Ideal     E     SI2      61.5    55   326  3.95  3.98  2.43
## 2 0.21  Premium   E     SI1      59.8    61   326  3.89  3.84  2.31
## 3 0.23  Good      E     VS1      56.9    65   327  4.05  4.07  2.31
## 4 0.290 Premium   I     VS2      62.4    58   334  4.2   4.23  2.63
## 5 0.31  Good      J     SI2      63.3    58   335  4.34  4.35  2.75
## 6 0.24  Very Good J     VVS2     62.8    57   336  3.94  3.96  2.48
qplot(price, data=diamonds)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

qplot(price, data=diamonds, binwidth=18497/30, fill=cut)
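
The value 18497 in the `binwidth` above is the spread of `price`, so dividing by 30 reproduces the default of 30 bins explicitly. A quick check, assuming ggplot2 (which ships the `diamonds` data) is attached; `price_range` and `bin_w` are illustrative names:

```r
library(ggplot2)

# Smallest and largest price in the diamonds data
price_range <- range(diamonds$price)
price_range

# Width of each of 30 equal bins spanning that range
bin_w <- diff(price_range) / 30
bin_w
```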